Abstract

A polar decomposition of the mutual information between the input and output of a complex-valued channel is proposed for an input whose amplitude and phase are independent of each other. The mutual information is symmetrically decomposed into three terms: an amplitude term, a phase term, and a cross term, where the cross term is negligible at high signal-to-noise ratio. Theoretical bounds on the amplitude and phase terms are derived for additive white Gaussian noise channels with Gaussian inputs. The decomposition is then applied to the recently proposed amplitude phase shift keying with product constellation (product-APSK) inputs. From an information-theoretical perspective, it shows that coded modulation schemes using product-APSK are able to outperform those using conventional quadrature amplitude modulation (QAM) while maintaining a low complexity.
I. Introduction

For a complex-valued channel, the channel input and output are usually complex-valued signals. Traditionally, the input signal $X$ is decomposed into its real and imaginary parts

$$X = X_I + jX_Q,$$

where $j = \sqrt{-1}$, and $X_I$ and $X_Q$ denote the real and imaginary parts, also known as the in-phase (I) and quadrature (Q) parts, respectively. The output signal $Y$ is decomposed as

$$Y = Y_I + jY_Q,$$

where $Y_I$ and $Y_Q$ denote the real and imaginary parts of $Y$. Thereafter, the mutual information between $X$ and $Y$ can be decomposed as

$$I(X;Y) = I(X_I;Y_I) + I(X_Q;Y_Q \mid X_I,Y_I) + I(X_I;Y_Q \mid Y_I) + I(X_Q;Y_I \mid X_I) \tag{1}$$

based on the chain rule of mutual information [1, Theorem 2.5.2, Page 24]. The decomposition (1) can be simplified as

$$I(X;Y) = I(X_I;Y_I) + I(X_Q;Y_Q) \tag{2}$$

when the following two conditions are satisfied.

1) $X_I$ and $X_Q$ are independent of each other, and

2) the distortions introduced by the channel affect the real and imaginary parts independently.

For example, for a rectangular quadrature amplitude modulation (QAM) input over additive white Gaussian noise (AWGN) channels, the channel can be decomposed into two sub-channels, namely, the real and imaginary sub-channels, also called the I and Q sub-channels [2, Page 278]. However, if either of the above two conditions is violated, the simplified decomposition (2) no longer holds. This is the case, for instance, when a high-order (higher than 4) phase shift keying (PSK) input signal is used, or when the channel distortions are I-and-Q dependent, e.g., in systems where non-linear amplifiers clip the amplitude, or in systems that introduce phase noise.

Most recently, Goebel et al. proposed a decomposition of mutual information based on the polar coordinate system, wherein the general case with an arbitrary input is considered, and the mutual information is decomposed into four terms: an amplitude term, a phase term, and two cross terms (called mixed terms therein) [3]. Such a decomposition is helpful in understanding the characteristics of channels with phase noise, and as an example, partially coherent detection was studied therein for fiber-optic communications.

In this correspondence, we investigate the decomposition for a special kind of input whose amplitude and phase are independent of each other, e.g., Gaussian inputs or product amplitude phase shift keying (product-APSK) inputs [4]. Different from [3], by exploiting the independence of amplitude and phase, we symmetrically decompose the mutual information into three terms: an amplitude term, a phase term, and a cross term. Rather than the approximations in [3], we derive theoretical bounds on the decomposed terms over AWGN channels for Gaussian inputs. We apply this decomposition to product-APSK inputs, and establish an information-theoretical foundation for designing and analyzing product-APSK for coded modulation (CM) schemes. We show from an information-theoretical perspective that CM schemes using product-APSK are able to achieve better performance while maintaining a low complexity, compared with CM schemes using square QAM. Please note that the decomposition in [3, Equ. (3)] is nonsymmetric, and the phase term therein still depends on the amplitude of the input signal.

It is worth emphasizing that conventionally it seems as if square QAM were the best choice for practical systems [5], [6]. Therefore, almost all current communication systems use square QAM constellations to achieve high spectral efficiency when the average transmit power is limited, including long-term evolution (LTE) and its advanced version LTE-A [7], terrestrial digital video broadcasting (DVB-T) and its second generation DVB-T2 [8], wireless local area network (WLAN) standards, etc. Nevertheless, our recent experiments show that well-designed product-APSK outperforms square QAM in terms of error performance while maintaining a low complexity [4], [9], [10]. These experiments motivate us to seek the information-theoretical background for product-APSK. That is why we propose the polar decomposition of mutual information in this correspondence.

As product-APSK is the motivation of our polar decomposition, we would like to introduce the road map of its development. APSK is an old modulation technique proposed several decades ago [11], where the radii are equally spaced to maximize the minimum Euclidean distance. Afterwards, owing to its low peak-to-average power ratio (PAPR), APSK has been optimized for satellite communications, e.g., the second-generation digital video broadcasting over satellite (DVB-S2) [12], [13]. However, these APSK constellations are optimized for transmissions that are peak-power limited, while most communication systems are average-power limited. In addition, such optimization targets independent demapping, i.e., without any feedback from the decoder to the demapper, such as in traditional bit-interleaved coded modulation (BICM) schemes [14], whereby QAM with Gray labeling (Gray-QAM) is much better than APSK when the average power rather than the peak power is limited. Furthermore, since the APSK labeling lacks a nice structure, the complexity of its demapper is higher than that of the Gray-QAM demapper.

Nevertheless, inspired by the observation in [15]–[17] that, in comparison with conventional QAM signals, shaping can be achieved using a constellation with non-uniformly spaced signal points, we showed that well-designed APSK is capable of providing a considerable shaping gain over complex-valued AWGN channels [9]. A basic explanation of why APSK may obtain a shaping gain over QAM is that only the complex-valued Gaussian distribution achieves the capacity of the complex-valued AWGN channel when the average power is limited. The complex Gaussian distribution is circularly symmetric; square QAM is not, whereas APSK is. Therefore, by properly assigning the non-uniformly spaced APSK points, the channel output using APSK exhibits more complex-Gaussian-like behavior than that using QAM.

Furthermore, well-designed Gray-labeled APSK (Gray-APSK) also outperforms Gray-QAM in both independent and iterative demapping scenarios in terms of error performance [10]. Iterative demapping means that iterations are performed between the demapper and the decoder, e.g., in BICM-ID schemes [18], [19]. The concept of Gray-APSK is extended to product-APSK in [4], wherein simplified independent demappers are also derived, ensuring that our product-APSK not only outperforms its QAM counterpart in terms of error performance, but also maintains a low complexity.

As a beneficial application of the proposed polar decomposition of mutual information, this correspondence establishes an information-theoretical foundation for product-APSK design and analysis. We shall see, from an information-theoretical perspective, why product-APSK achieves better performance while maintaining a low complexity.

The rest of this correspondence is organized as follows. We propose the polar decomposition of mutual information in Section II. Section III derives the decomposition for Gaussian inputs, where theoretical bounds are presented. In Section IV we apply the decomposition to product-APSK inputs, which is beneficial for the product-APSK design as well as the construction of its simplified demapper. Section V provides numerical results to verify our analysis for both Gaussian and product-APSK inputs. Finally, conclusions are drawn in Section VI.

For the sake of clarity, the following notations are employed throughout this correspondence. Upper-case calligraphic symbols denote sets, e.g., $\mathcal{X}$. Symbols in boldface denote vectors, e.g., $\mathbf{x}$. Upper-case symbols denote random variables (R.V.s), e.g., $X$, while the corresponding lower-case symbols denote their realizations, e.g., $x$. $P_X(x)$ is used for the probability of the discrete event $X = x$, and $p_X(x)$ is used for the probability density function (PDF) of a continuous R.V. $X$. $P_{Y \mid X}(y \mid x)$ represents the conditional probability of $Y = y$ given $X = x$, and $p_{Y \mid X}(y \mid x)$ represents the conditional PDF of $Y$ given $X = x$. $\log(\cdot)$ denotes the natural logarithm, and $\log_2(\cdot)$ denotes the base-2 logarithm. $I(X;Y)$ denotes the mutual information between $X$ and $Y$, and $I(X;Y \mid Z)$ denotes the conditional mutual information between $X$ and $Y$ given $Z$. $H(X)$ denotes the entropy of a discrete R.V. $X$, and $H(Y \mid X)$ denotes the conditional entropy of $Y$ given $X$. $h(X)$ denotes the differential entropy of a continuous R.V. $X$, and $h(Y \mid X)$ denotes the conditional differential entropy of $Y$ given $X$. $E[\cdot]$ denotes the expectation, and $E_x[\cdot]$ denotes the expectation with respect to $x$.

II. Polar Decomposition of Mutual Information

Consider a channel with complex-valued input $X$ and output $Y$, which can be expressed in a polar-coordinate system as

$$X = X_\parallel \cdot \exp(jX_\angle), \quad X_\parallel \in [0, +\infty), \quad X_\angle \in [-\pi, \pi), \tag{3}$$

and

$$Y = Y_\parallel \cdot \exp(jY_\angle), \quad Y_\parallel \in [0, +\infty), \quad Y_\angle \in [-\pi, \pi), \tag{4}$$

where $X_\parallel$ and $Y_\parallel$ denote the amplitudes of $X$ and $Y$, respectively, and $X_\angle$ and $Y_\angle$ denote their corresponding phases. Based on the chain rule of mutual information [1, Theorem 2.5.2, Page 24], we have

$$\begin{aligned} I(X;Y) &= I(X_\parallel, X_\angle; Y_\parallel, Y_\angle) \\ &= I(X_\parallel;Y) + I(X_\angle;Y \mid X_\parallel). \end{aligned} \tag{5}$$

We focus on a special input case whose amplitude and phase are independent of each other, e.g., standard complex-valued Gaussian inputs [2] or product-APSK inputs¹. When $X_\parallel$ is independent of $X_\angle$, we have $h(X_\angle \mid X_\parallel) = h(X_\angle)$ for a continuous $X_\angle$, or $H(X_\angle \mid X_\parallel) = H(X_\angle)$ for a discrete $X_\angle$. Assuming that $X_\angle$ is continuous without loss of generality, we have

$$\begin{aligned} I(X_\angle;Y \mid X_\parallel) &= h(X_\angle \mid X_\parallel) - h(X_\angle \mid X_\parallel, Y) \\ &= h(X_\angle) - h(X_\angle \mid Y) + h(X_\angle \mid Y) - h(X_\angle \mid X_\parallel, Y) \\ &= I(X_\angle;Y) + I(X_\parallel;X_\angle \mid Y). \end{aligned} \tag{6}$$

Therefore, by applying (6) to (5) we obtain the decomposition

$$I(X;Y) = \underbrace{I(X_\parallel;Y)}_{\text{Amplitude term}} + \underbrace{I(X_\angle;Y)}_{\text{Phase term}} + \underbrace{I(X_\parallel;X_\angle \mid Y)}_{\text{Cross term}} \tag{7}$$

when $X_\parallel$ and $X_\angle$ are independent of each other. Please note that our decomposition (7) differs from that of Goebel et al. [3, Equ. (3)] in that our phase term $I(X_\angle;Y)$ is independent of the amplitude of the input signal, and thus we have a nice symmetric expression.
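The identity (7) holds for any channel as long as the amplitude and phase of the input are independent. As a minimal sanity check, the following sketch evaluates both sides of (7) for a small discrete toy model. The alphabet sizes, the random channel, and the names `entropy`, `polar_terms`, and `random_pmf` are illustrative assumptions and are not taken from the correspondence.

```python
import math
import random

def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def polar_terms(p_a, p_p, chan):
    """Return I(A,P;Y), I(A;Y), I(P;Y), I(A;P|Y) in bits.

    p_a[a], p_p[p]: marginals of the independent amplitude and phase indices.
    chan[a][p][y]:  transition probability P(y | a, p) of a toy channel.
    """
    n_a, n_p, n_y = len(p_a), len(p_p), len(chan[0][0])
    joint = [[[p_a[a] * p_p[p] * chan[a][p][y] for y in range(n_y)]
              for p in range(n_p)] for a in range(n_a)]
    p_ap = [sum(joint[a][p]) for a in range(n_a) for p in range(n_p)]
    p_ay = [sum(joint[a][p][y] for p in range(n_p))
            for a in range(n_a) for y in range(n_y)]
    p_py = [sum(joint[a][p][y] for a in range(n_a))
            for p in range(n_p) for y in range(n_y)]
    p_y = [sum(joint[a][p][y] for a in range(n_a) for p in range(n_p))
           for y in range(n_y)]
    h_apy = entropy(v for plane in joint for row in plane for v in row)
    h_y, h_ay, h_py = entropy(p_y), entropy(p_ay), entropy(p_py)
    i_total = entropy(p_ap) + h_y - h_apy      # I(A,P;Y)
    i_amp = entropy(p_a) + h_y - h_ay          # I(A;Y)
    i_phase = entropy(p_p) + h_y - h_py        # I(P;Y)
    i_cross = h_ay + h_py - h_y - h_apy        # I(A;P|Y)
    return i_total, i_amp, i_phase, i_cross

rng = random.Random(0)

def random_pmf(n):
    """Hypothetical helper: a random probability vector of length n."""
    w = [rng.random() + 1e-3 for _ in range(n)]
    s = sum(w)
    return [v / s for v in w]

p_a, p_p = random_pmf(3), random_pmf(4)
chan = [[random_pmf(5) for _ in range(4)] for _ in range(3)]
i_total, i_amp, i_phase, i_cross = polar_terms(p_a, p_p, chan)
```

In this toy model the two sides agree up to floating-point rounding because the joint input distribution is constructed as a product; with a dependent amplitude and phase the three-term sum would in general differ from the total.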

The polar mutual information decomposition (7) is helpful in understanding the characteristics of channels whose input has independent amplitude and phase. Traditionally, for square QAM inputs we decompose the channel into two independent I and Q sub-channels in order to simplify the detection complexity, when the two conditions shown in Section I are satisfied. However, product-APSK inputs clearly violate condition 1. Fortunately, by using our polar decomposition (7), we can approximately decompose the channel $C: X \mapsto Y$ into two sub-channels, i.e., the amplitude sub-channel $C_\parallel: X_\parallel \mapsto Y$ and the phase sub-channel $C_\angle: X_\angle \mapsto Y$, since we will illustrate that the cross term $I(X_\parallel;X_\angle \mid Y)$ is negligible. This channel decomposition helps us to simplify the product-APSK demapper in CM schemes.

We now apply the decomposition (7) to the complex-valued AWGN channel

$$Y = X + W, \tag{8}$$

where $Y$ denotes the output signal, $X$ denotes the input signal with the power constraint $E[|X|^2] = E_s$, and $W$ denotes the Gaussian noise with zero mean and variance $N_0$, i.e., $W \sim \mathcal{CN}(0, N_0)$. The signal-to-noise ratio (SNR) is defined as

$$\mathrm{SNR} = E_s/N_0. \tag{9}$$

¹The amplitude and phase of a product-APSK input are independent of each other, since we can verify that $P_{X_\parallel,X_\angle}(x_\parallel,x_\angle) = P_{X_\parallel}(x_\parallel) \cdot P_{X_\angle}(x_\angle)$; see Section IV for details.
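III. Gaussian Inputs

For a complex-valued circularly symmetric Gaussian input $X \sim \mathcal{CN}(0, E_s)$, we derive the expression for each term in (7). Our expression of the amplitude term is quite similar to that in [3], except that we derive its lower bound while an approximation was presented in [3].

A. The Amplitude Term

We write the amplitude term $I(X_\parallel;Y)$ as

$$\begin{aligned} I(X_\parallel;Y) &= I(X_\parallel;Y_\parallel) + I(X_\parallel;Y_\angle \mid Y_\parallel) \\ &\overset{(a)}{=} I(X_\parallel;Y_\parallel) \end{aligned} \tag{10}$$

by using the chain rule of mutual information [1, Theorem 2.5.2, Page 24], wherein (a) follows from the fact that for a complex-valued Gaussian input $X$, the output $Y$ is also complex-valued Gaussian distributed, and therefore the phase $Y_\angle$ is uniformly distributed within $[-\pi, \pi)$ whether or not the amplitude is given, that is, $p_{Y_\angle \mid Y_\parallel}(y_\angle \mid y_\parallel) = p_{Y_\angle \mid X_\parallel,Y_\parallel}(y_\angle \mid x_\parallel,y_\parallel) = p_{Y_\angle}(y_\angle) = 1/(2\pi)$ for $y_\angle \in [-\pi, \pi)$ and 0 outside, so that we have $I(X_\parallel;Y_\angle \mid Y_\parallel) = h(Y_\angle \mid Y_\parallel) - h(Y_\angle \mid X_\parallel,Y_\parallel) = 0$. As shown in [3], $I(X_\parallel;Y_\parallel)$ can be expressed as

$$I(X_\parallel;Y_\parallel) = h(Y_\parallel) - h(Y_\parallel \mid X_\parallel), \tag{11}$$

where [3]

$$h(Y_\parallel) = \frac{1}{2}\log_2(E_s + N_0) + (1 + \gamma/2)\log_2 e - 1, \tag{12}$$

and

$$h(Y_\parallel \mid X_\parallel) = -\iint p_{X_\parallel}(x_\parallel)\, p_{Y_\parallel \mid X_\parallel}(y_\parallel \mid x_\parallel) \log_2 p_{Y_\parallel \mid X_\parallel}(y_\parallel \mid x_\parallel)\, dx_\parallel\, dy_\parallel. \tag{13}$$

Here in (12), $\gamma \approx 0.5772$ denotes the Euler constant, and the conditional PDF $p_{Y_\parallel \mid X_\parallel}(y_\parallel \mid x_\parallel)$ follows a Rice distribution [2, Page 46]

$$p_{Y_\parallel \mid X_\parallel}(y_\parallel \mid x_\parallel) = \frac{2y_\parallel}{N_0} \exp\left(-\frac{x_\parallel^2 + y_\parallel^2}{N_0}\right) I_0\left(\frac{2x_\parallel y_\parallel}{N_0}\right), \tag{14}$$

where $I_0(\cdot)$ denotes the modified Bessel function of the first kind with order zero.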
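The closed form (12) is the differential entropy of a Rayleigh-distributed amplitude with $E[Y_\parallel^2] = E_s + N_0$. The following sketch compares (12) with a direct numerical integration of $-\int p \log_2 p$. The function names, the midpoint rule, and the truncation of the integration range are assumptions made for illustration only.

```python
import math

EULER_GAMMA = 0.5772156649015329

def h_amplitude_closed_form(es, n0):
    """Right-hand side of (12): differential entropy of the output amplitude in bits."""
    return 0.5 * math.log2(es + n0) + (1.0 + EULER_GAMMA / 2.0) * math.log2(math.e) - 1.0

def h_amplitude_numeric(es, n0, steps=100_000):
    """Midpoint-rule estimate of -integral p*log2(p) dy for the Rayleigh PDF of |Y|."""
    power = es + n0                      # E[|Y|^2] for Y ~ CN(0, Es + N0)
    upper = 10.0 * math.sqrt(power)      # assumed cut-off; the tail beyond it is negligible
    dy = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * dy
        p = 2.0 * y / power * math.exp(-y * y / power)
        if p > 0.0:
            total -= p * math.log2(p) * dy
    return total

closed = h_amplitude_closed_form(1.0, 0.1)
numeric = h_amplitude_numeric(1.0, 0.1)
```

The two values should agree to several decimal places for any positive `es` and `n0`; the residual difference comes only from the discretization.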

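Clearly (13) does not have a closed-form expression, and an approximation was derived in [3]. In this correspondence, we instead bound it, which leads to a lower bound on the amplitude term. We commence by bounding the conditional variance of $Y_\parallel$ given $X_\parallel$. The first moment of $Y_\parallel$ given $X_\parallel = x_\parallel$ is

$$\begin{aligned} E[Y_\parallel \mid X_\parallel = x_\parallel] &= \int_0^\infty y_\parallel\, p_{Y_\parallel \mid X_\parallel}(y_\parallel \mid x_\parallel)\, dy_\parallel \\ &= \frac{\sqrt{\pi}}{2\sqrt{N_0}} \exp\left(-\frac{x_\parallel^2}{2N_0}\right) \left[ (x_\parallel^2 + N_0)\, I_0\left(\frac{x_\parallel^2}{2N_0}\right) + x_\parallel^2\, I_1\left(\frac{x_\parallel^2}{2N_0}\right) \right], \end{aligned} \tag{15}$$

where $I_1(\cdot)$ represents the modified Bessel function of the first kind with order one. The second moment of $Y_\parallel$ given $X_\parallel = x_\parallel$ is

$$\begin{aligned} E[Y_\parallel^2 \mid X_\parallel = x_\parallel] &= \int_0^\infty y_\parallel^2\, p_{Y_\parallel \mid X_\parallel}(y_\parallel \mid x_\parallel)\, dy_\parallel \\ &= N_0 + x_\parallel^2. \end{aligned} \tag{16}$$

Therefore, the variance of $Y_\parallel$ given $X_\parallel = x_\parallel$ can be evaluated as

$$\begin{aligned} \mathrm{Var}[Y_\parallel \mid X_\parallel = x_\parallel] &= E[Y_\parallel^2 \mid X_\parallel = x_\parallel] - \left(E[Y_\parallel \mid X_\parallel = x_\parallel]\right)^2 \\ &= N_0 \Big( \underbrace{1 + \lambda - \frac{\pi}{4} \exp(-\lambda) \left[ (1 + \lambda) I_0(\lambda/2) + \lambda I_1(\lambda/2) \right]^2}_{f(\lambda)} \Big), \end{aligned} \tag{17}$$

where $\lambda = x_\parallel^2/N_0$. As shown in Appendix A, we have $f(\lambda) < 1/2$, and accordingly

$$\mathrm{Var}[Y_\parallel \mid X_\parallel = x_\parallel] < N_0/2. \tag{18}$$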
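The behavior of $f(\lambda)$ in (17) can be inspected numerically. The sketch below evaluates the exponentially scaled Bessel functions through the standard integral representation $I_n(x) = \frac{1}{\pi}\int_0^\pi e^{x\cos t}\cos(nt)\,dt$ for integer $n$, so that only the Python standard library is needed. The helper names and the number of quadrature steps are assumptions, and the check is numerical evidence only, not a substitute for the proof in Appendix A.

```python
import math

def scaled_bessel_i(order, x, steps=4000):
    """exp(-x) * I_order(x) for integer order, via (1/pi) * integral_0^pi exp(x(cos t - 1)) cos(order t) dt."""
    dt = math.pi / steps
    acc = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                      # midpoint rule
        acc += math.exp(x * (math.cos(t) - 1.0)) * math.cos(order * t)
    return acc * dt / math.pi

def f_lambda(lam):
    """Normalised conditional variance Var[Y_par | X_par = x_par] / N0 from (17)."""
    half = lam / 2.0
    # exp(-lam) * [.]^2 equals (exp(-lam/2) * [.])^2, so the scaled Bessel values are used directly.
    bracket = (1.0 + lam) * scaled_bessel_i(0, half) + lam * scaled_bessel_i(1, half)
    return 1.0 + lam - math.pi / 4.0 * bracket * bracket

values = {lam: f_lambda(lam) for lam in (0.0, 0.1, 1.0, 5.0, 20.0, 100.0)}
```

At $\lambda = 0$ the output amplitude is Rayleigh distributed and $f(0) = 1 - \pi/4$; for growing $\lambda$ the computed values increase towards, but stay below, $1/2$, in line with (18).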

For a R.V. with a limited variance, the Gaussian distribution maximizes the differential entropy [1, p. 411, Example 12.2.1], so that

$$h(Y_\parallel \mid X_\parallel = x_\parallel) \le h\left(\mathcal{N}(0, N_0/2)\right) = \frac{1}{2}\log_2(\pi e N_0), \quad \forall x_\parallel \in [0, \infty). \tag{19}$$

Now, we have

$$h(Y_\parallel \mid X_\parallel) = E_{x_\parallel}\, h(Y_\parallel \mid X_\parallel = x_\parallel) \le \frac{1}{2}\log_2(\pi e N_0). \tag{20}$$

Consequently, by applying (20), (12), and (11) to (10), we obtain the lower bound of $I(X_\parallel;Y)$

$$I(X_\parallel;Y) = I(X_\parallel;Y_\parallel) \ge \frac{1}{2}\log_2\left(1 + \frac{E_s}{N_0}\right) + \underbrace{\frac{1 + \gamma}{2}\log_2 e - \frac{1}{2}\log_2 \pi - 1}_{\approx -0.69}. \tag{21}$$

For a very high SNR, since we have $\lambda = x_\parallel^2/N_0 \to \infty$ and $I_0(\lambda/2) \approx \exp(\lambda/2)/\sqrt{\pi\lambda}$ [20, p. 377, 9.7.1], it shows that (21) is also a good approximation [3]; in other words, the lower bound (21) is tight at high SNR. In fact, we can also show that as $\lambda \to \infty$, we have $f(\lambda) \to 1/2$, so that the variance of $Y_\parallel$ given $X_\parallel$ approaches $N_0/2$; see Appendix A for the proof.
B. The Phase Term

The phase term $I(X_\angle;Y)$ can be written as

$$\begin{aligned} I(X_\angle;Y) &= I(X_\angle;Y_\parallel) + I(X_\angle;Y_\angle \mid Y_\parallel) \\ &\overset{(a)}{=} I(X_\angle;Y_\angle \mid Y_\parallel), \end{aligned} \tag{22}$$

where (a) follows from the fact that the output's amplitude $Y_\parallel$ is independent of the input's phase $X_\angle$, so that we have $I(X_\angle;Y_\parallel) = 0$. It is notable that our phase term is different from the one in [3], wherein it is conditioned on the amplitude of the input signal. Our conditional mutual information $I(X_\angle;Y_\angle \mid Y_\parallel)$ can be evaluated as

$$I(X_\angle;Y_\angle \mid Y_\parallel) = h(Y_\angle \mid Y_\parallel) - h(Y_\angle \mid X_\angle,Y_\parallel). \tag{23}$$

As the output signal $Y$ is complex-valued Gaussian distributed, i.e., $Y \sim \mathcal{CN}(0, E_s + N_0)$, the angle $Y_\angle$ is uniformly distributed within $[-\pi, \pi)$ while being independent of $Y_\parallel$. Thereby, we have

$$h(Y_\angle \mid Y_\parallel) = h(Y_\angle) = \log_2(2\pi). \tag{24}$$

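Since $h(Y_\angle \mid Y_\parallel,X_\angle)$ is unaffected by the constant phase shift $X_\angle$, we assume $X_\angle = 0$ without loss of generality, and accordingly, we have

$$\begin{aligned} & p_{Y_\angle \mid Y_\parallel,X_\angle}(y_\angle \mid y_\parallel, x_\angle = 0)\, p_{Y_\parallel \mid X_\angle}(y_\parallel \mid x_\angle = 0) = p_{Y_\parallel,Y_\angle \mid X_\angle}(y_\parallel,y_\angle \mid x_\angle = 0) \\ &\quad = \int_0^\infty p_{X_\parallel}(x_\parallel)\, p_{Y_\parallel,Y_\angle \mid X_\parallel,X_\angle}(y_\parallel,y_\angle \mid x_\parallel, x_\angle = 0)\, dx_\parallel \\ &\quad = \int_0^\infty \frac{2x_\parallel}{E_s} \exp\left(-\frac{x_\parallel^2}{E_s}\right) \frac{y_\parallel}{\pi N_0} \exp\left(-\frac{x_\parallel^2 + y_\parallel^2 - 2x_\parallel y_\parallel \cos y_\angle}{N_0}\right) dx_\parallel. \end{aligned} \tag{25}$$

Moreover, since $Y_\parallel$ is independent of $X_\angle$, the conditional PDF $p_{Y_\parallel \mid X_\angle}(y_\parallel \mid x_\angle = 0) = p_{Y_\parallel}(y_\parallel)$ also follows a Rayleigh distribution. Therefore, we have

$$\begin{aligned} p_{Y_\angle \mid Y_\parallel,X_\angle}(y_\angle \mid y_\parallel, x_\angle = 0) &= \frac{1}{2\pi} \exp\left(-\frac{y_\parallel^2}{N_0(1 + \eta)}\right) \\ &\quad + \frac{y_\parallel \cos y_\angle}{2\sqrt{\pi N_0(1 + \eta)}} \exp\left(-\frac{y_\parallel^2 \sin^2 y_\angle}{N_0(1 + \eta)}\right) \left[1 + \mathrm{Erf}\left(\frac{y_\parallel \cos y_\angle}{\sqrt{N_0(1 + \eta)}}\right)\right], \end{aligned} \tag{26}$$

where $\eta = N_0/E_s$ denotes the inverse of the SNR, and the error function $\mathrm{Erf}(x)$ is defined as

$$\mathrm{Erf}(x) = \frac{2}{\sqrt{\pi}} \int_0^x \exp(-t^2)\, dt. \tag{27}$$

For a high SNR we have $y_\parallel^2/N_0 \to \infty$, $\eta \to 0$, and $y_\angle \to x_\angle = 0$, and thereby (26) can be approximated as

$$p_{Y_\angle \mid Y_\parallel,X_\angle}(y_\angle \mid y_\parallel, x_\angle = 0) \approx \frac{1}{\sqrt{\pi N_0/y_\parallel^2}} \exp\left(-\frac{y_\angle^2}{N_0/y_\parallel^2}\right). \tag{28}$$

Since $p(y_\angle \mid y_\parallel, x_\angle = 0)$ tends to the PDF of a real-valued Gaussian distribution with zero mean and variance $N_0/(2y_\parallel^2)$, we obtain

$$\begin{aligned} h(Y_\angle \mid X_\angle, Y_\parallel = y_\parallel) &= h(Y_\angle \mid X_\angle = 0, Y_\parallel = y_\parallel) \\ &\approx \frac{1}{2}\log_2\left(\pi e \frac{N_0}{y_\parallel^2}\right). \end{aligned} \tag{29}$$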

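By taking the expectation with respect to $y_\parallel$, we have

$$\begin{aligned} h(Y_\angle \mid X_\angle,Y_\parallel) &= \int_0^\infty p_{Y_\parallel}(y_\parallel)\, h(Y_\angle \mid X_\angle = 0, Y_\parallel = y_\parallel)\, dy_\parallel \\ &\approx \int_0^\infty \frac{2y_\parallel}{E_s + N_0} \exp\left(-\frac{y_\parallel^2}{E_s + N_0}\right) \cdot \frac{1}{2}\log_2\left(\pi e \frac{N_0}{y_\parallel^2}\right) dy_\parallel \\ &= \frac{1}{2}\log_2\left(\frac{N_0}{E_s + N_0}\right) + \frac{1 + \gamma}{2}\log_2 e + \frac{1}{2}\log_2 \pi. \end{aligned} \tag{30}$$

Applying (30) and (24) to (23) yields

$$\begin{aligned} I(X_\angle;Y_\angle \mid Y_\parallel) &= h(Y_\angle \mid Y_\parallel) - h(Y_\angle \mid X_\angle,Y_\parallel) \\ &\approx \frac{1}{2}\log_2\left(1 + \frac{E_s}{N_0}\right) - \frac{1 + \gamma}{2}\log_2 e + \frac{1}{2}\log_2 \pi + 1. \end{aligned} \tag{31}$$

Based on the lower bound of $I(X_\parallel;Y)$ shown in (21), the decomposition (7), the non-negativity of the cross term, and the fact that the capacity of an AWGN channel is achieved as $I(X;Y) = \log_2(1 + E_s/N_0)$ with a Gaussian input, it is clear that (31) is an upper bound of $I(X_\angle;Y_\angle \mid Y_\parallel)$:

$$I(X_\angle;Y) = I(X_\angle;Y_\angle \mid Y_\parallel) \le \frac{1}{2}\log_2\left(1 + \frac{E_s}{N_0}\right) \underbrace{- \frac{1 + \gamma}{2}\log_2 e + \frac{1}{2}\log_2 \pi + 1}_{\approx 0.69}. \tag{32}$$

This upper bound is also tight at a high SNR, as shown in (31).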
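The two bounds (21) and (32) share the same constant offset with opposite signs, so they add up to the AWGN capacity $\log_2(1 + E_s/N_0)$. A minimal sketch that evaluates them is given below; the function and constant names are illustrative assumptions.

```python
import math

EULER_GAMMA = 0.5772156649015329

# Magnitude of the constant offset in (21) and (32), in bits.
OFFSET = 1.0 + 0.5 * math.log2(math.pi) - (1.0 + EULER_GAMMA) / 2.0 * math.log2(math.e)

def amplitude_lower_bound(snr_db):
    """Lower bound (21) on the amplitude term, in bits/channel use."""
    snr = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.log2(1.0 + snr) - OFFSET

def phase_upper_bound(snr_db):
    """Upper bound (32) on the phase term, in bits/channel use."""
    snr = 10.0 ** (snr_db / 10.0)
    return 0.5 * math.log2(1.0 + snr) + OFFSET

def awgn_capacity(snr_db):
    """Capacity log2(1 + SNR) of the complex-valued AWGN channel."""
    return math.log2(1.0 + 10.0 ** (snr_db / 10.0))
```

Note that the lower bound (21) becomes negative, and hence trivial, at low SNR; both bounds are informative only where the high-SNR approximation applies. The difference between the two bounds is twice the offset, which is the gap of about 1.38 bits between the phase and amplitude terms at high SNR.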

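C. The Cross Term

The cross term $I(X_\parallel;X_\angle \mid Y) = I(X_\parallel;X_\angle \mid Y_\parallel,Y_\angle)$ can be calculated as

$$I(X_\parallel;X_\angle \mid Y_\parallel,Y_\angle) = E_{x_\parallel,x_\angle,y_\parallel,y_\angle} \log_2 \frac{p_{X_\parallel,X_\angle \mid Y_\parallel,Y_\angle}(x_\parallel,x_\angle \mid y_\parallel,y_\angle)}{p_{X_\parallel \mid Y_\parallel,Y_\angle}(x_\parallel \mid y_\parallel,y_\angle)\, p_{X_\angle \mid Y_\parallel,Y_\angle}(x_\angle \mid y_\parallel,y_\angle)}. \tag{33}$$

The joint PDF $p_{X_\parallel,X_\angle,Y_\parallel,Y_\angle}(x_\parallel,x_\angle,y_\parallel,y_\angle)$ is expressed as

$$\begin{aligned} p_{X_\parallel,X_\angle,Y_\parallel,Y_\angle}(x_\parallel,x_\angle,y_\parallel,y_\angle) &= p_{X_\parallel,X_\angle}(x_\parallel,x_\angle)\, p_{Y_\parallel,Y_\angle \mid X_\parallel,X_\angle}(y_\parallel,y_\angle \mid x_\parallel,x_\angle) \\ &= \frac{x_\parallel}{\pi E_s} \exp\left(-\frac{x_\parallel^2}{E_s}\right) \frac{y_\parallel}{\pi N_0} \exp\left(-\frac{x_\parallel^2 + y_\parallel^2 - 2x_\parallel y_\parallel \cos(y_\angle - x_\angle)}{N_0}\right). \end{aligned} \tag{34}$$

Then, the conditional PDF $p_{X_\parallel,X_\angle \mid Y_\parallel,Y_\angle}(x_\parallel,x_\angle \mid y_\parallel,y_\angle)$ is

$$p_{X_\parallel,X_\angle \mid Y_\parallel,Y_\angle}(x_\parallel,x_\angle \mid y_\parallel,y_\angle) = \frac{E_s + N_0}{\pi E_s N_0}\, x_\parallel \exp\left(\frac{y_\parallel^2}{E_s + N_0} - \frac{x_\parallel^2}{E_s} - \frac{x_\parallel^2 + y_\parallel^2 - 2x_\parallel y_\parallel \cos(y_\angle - x_\angle)}{N_0}\right). \tag{35}$$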

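Now, the conditional PDFs $p_{X_\parallel \mid Y_\parallel,Y_\angle}(x_\parallel \mid y_\parallel,y_\angle)$ and $p_{X_\angle \mid Y_\parallel,Y_\angle}(x_\angle \mid y_\parallel,y_\angle)$ can be obtained, respectively, as

$$\begin{aligned} p_{X_\parallel \mid Y_\parallel,Y_\angle}(x_\parallel \mid y_\parallel,y_\angle) &= \int_{-\pi}^{\pi} p_{X_\parallel,X_\angle \mid Y_\parallel,Y_\angle}(x_\parallel,x_\angle \mid y_\parallel,y_\angle)\, dx_\angle \\ &= \frac{2x_\parallel(1 + \eta)}{N_0} \exp\left(-\frac{(1 + \eta)^2 x_\parallel^2 + y_\parallel^2}{N_0(1 + \eta)}\right) I_0\left(\frac{2x_\parallel y_\parallel}{N_0}\right) \end{aligned} \tag{36}$$

and

$$\begin{aligned} p_{X_\angle \mid Y_\parallel,Y_\angle}(x_\angle \mid y_\parallel,y_\angle) &= \int_0^\infty p_{X_\parallel,X_\angle \mid Y_\parallel,Y_\angle}(x_\parallel,x_\angle \mid y_\parallel,y_\angle)\, dx_\parallel \\ &= \frac{1}{2\pi} \exp\left(-\frac{y_\parallel^2}{N_0(1 + \eta)}\right) + \frac{y_\parallel \cos(y_\angle - x_\angle)}{2\sqrt{\pi N_0(1 + \eta)}} \\ &\quad \times \exp\left(-\frac{y_\parallel^2 \sin^2(y_\angle - x_\angle)}{N_0(1 + \eta)}\right) \left[1 + \mathrm{Erf}\left(\frac{y_\parallel \cos(y_\angle - x_\angle)}{\sqrt{N_0(1 + \eta)}}\right)\right], \end{aligned} \tag{37}$$

where $\eta = N_0/E_s$ denotes the inverse of the SNR. It is interesting that the conditional PDF $p_{X_\angle \mid Y_\parallel,Y_\angle}(\cdot \mid \cdot, \cdot)$ shown in (37) has the same form as the conditional PDF $p_{Y_\angle \mid Y_\parallel,X_\angle}(\cdot \mid \cdot, \cdot)$ shown in (26), because the known angle solely affects the centroid.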

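By applying (35), (36), and (37) to (33), the cross mutual information $I(X_\parallel;X_\angle \mid Y_\parallel,Y_\angle)$ can be obtained accordingly. However, we do not have a closed-form expression for $I(X_\parallel;X_\angle \mid Y_\parallel,Y_\angle)$. In the following, we discuss the two limiting cases of a very low and a very high SNR.

For a very low SNR, i.e., $N_0 \to \infty$, we have

$$p_{X_\parallel,X_\angle \mid Y_\parallel,Y_\angle}(x_\parallel,x_\angle \mid y_\parallel,y_\angle) \to \frac{x_\parallel}{\pi E_s} \exp\left(-\frac{x_\parallel^2}{E_s}\right), \tag{38}$$

$$p_{X_\parallel \mid Y_\parallel,Y_\angle}(x_\parallel \mid y_\parallel,y_\angle) \to \frac{2x_\parallel}{E_s} \exp\left(-\frac{x_\parallel^2}{E_s}\right), \tag{39}$$

$$p_{X_\angle \mid Y_\parallel,Y_\angle}(x_\angle \mid y_\parallel,y_\angle) \to \frac{1}{2\pi}. \tag{40}$$

Therefore, even given $Y_\parallel$ and $Y_\angle$, the amplitude $X_\parallel$ still tends to be Rayleigh distributed, the phase $X_\angle$ tends to be uniformly distributed, and they tend to be independent of each other; thus we have $I(X_\parallel;X_\angle \mid Y_\parallel,Y_\angle) \to 0$ for a very low SNR.

For a very high SNR, i.e., $N_0 \to 0$, we have $x_\parallel \to y_\parallel$ and $x_\angle \to y_\angle$. Thereby, we have

$$p_{X_\parallel,X_\angle \mid Y_\parallel,Y_\angle}(x_\parallel,x_\angle \mid y_\parallel,y_\angle) \to \frac{1}{\sqrt{\pi N_0}} \exp\left(-\frac{(x_\parallel - y_\parallel)^2}{N_0}\right) \cdot \frac{1}{\sqrt{\pi N_0/y_\parallel^2}} \exp\left(-\frac{(x_\angle - y_\angle)^2}{N_0/y_\parallel^2}\right), \tag{41}$$

$$p_{X_\parallel \mid Y_\parallel,Y_\angle}(x_\parallel \mid y_\parallel,y_\angle) \to \frac{1}{\sqrt{\pi N_0}} \exp\left(-\frac{(x_\parallel - y_\parallel)^2}{N_0}\right), \tag{42}$$

$$p_{X_\angle \mid Y_\parallel,Y_\angle}(x_\angle \mid y_\parallel,y_\angle) \to \frac{1}{\sqrt{\pi N_0/y_\parallel^2}} \exp\left(-\frac{(x_\angle - y_\angle)^2}{N_0/y_\parallel^2}\right), \tag{43}$$

for $N_0 \to 0$. In this case, it is interesting that both $X_\parallel$ and $X_\angle$ tend to be Gaussian distributed given $Y_\parallel = y_\parallel$ and $Y_\angle = y_\angle$, i.e., $X_\parallel \sim \mathcal{N}(y_\parallel, N_0/2)$ and $X_\angle \sim \mathcal{N}(y_\angle, N_0/(2y_\parallel^2))$. Moreover, $X_\parallel$ and $X_\angle$ also tend to be independent of each other even given $Y_\parallel$ and $Y_\angle$; thus we have $I(X_\parallel;X_\angle \mid Y_\parallel,Y_\angle) \to 0$ at a very high SNR.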

For these two limiting cases, we can also explain the behavior of the cross term physically as follows. For a very low SNR we have $I(X;Y) \to 0$, and therefore the cross term $I(X_\parallel;X_\angle \mid Y)$ also tends to 0 because $I(X_\parallel;X_\angle \mid Y) \le I(X;Y)$ according to the decomposition (7). Alternatively, for a very noisy channel, knowing the output $Y$ provides little information about the input $X$, so that we have $I(X_\parallel;X_\angle \mid Y) \to I(X_\parallel;X_\angle) = 0$.

For a very high SNR, we have $Y \to X$, and therefore $I(X_\parallel;X_\angle \mid Y) \to I(X_\parallel;X_\angle \mid X) = 0$. Furthermore, the lower bound of $I(X_\parallel;Y)$ in (21) and the upper bound of $I(X_\angle;Y)$ in (32) are both tight at a high SNR, and it can be observed that $I(X_\parallel;Y) + I(X_\angle;Y) \to \log_2(1 + E_s/N_0) = I(X;Y)$. Therefore, this also shows that $I(X_\parallel;X_\angle \mid Y) \to 0$ at a very high SNR.

IV. Product-APSK Inputs

Fig. 1: Product-APSK constellations, where the radii are determined according to (45). (a) Product-64APSK, and (b) Product-256APSK.

Having discussed the decomposition for Gaussian inputs, let us investigate product-APSK inputs. An $(M = 2^{m_\angle} \times 2^{m_\parallel})$-ary product-APSK constellation consists of $2^{m_\parallel}$ rings, wherein each ring possesses $2^{m_\angle}$ uniformly distributed points. The product-APSK constellation signal set $\mathcal{X}$ is described by

$$\mathcal{X} = \left\{ r_q \exp(j\varphi_p) : p \in \{0, \cdots, 2^{m_\angle} - 1\};\ q \in \{0, \cdots, 2^{m_\parallel} - 1\} \right\}, \tag{44}$$

where $\varphi_p = \frac{\pi}{2^{m_\angle}}(2p + 1)$ denotes the $p$-th phase shift, and the radius $r_q$ of the $q$-th ring is recommended to be [4]

$$r_q = \sqrt{-\log\left[1 - (q + 1/2) \cdot 2^{-m_\parallel}\right]}. \tag{45}$$

This radius $r_q$ is determined by letting the probability that a standard complex-valued Gaussian R.V. is within the $q$-th ring equal the probability that the product-APSK signal is within the $q$-th ring, where half the points on the $q$-th ring of the product-APSK constellation are counted as being within the $q$-th ring. Such a radius is quite similar to that for the nonuniform PAM design [15] or the ring constellation design [21, Equ. (83)], whereby the ring constellation consists of several rings each with a uniform phase within $[-\pi, \pi)$.
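The construction in (44) and (45) can be sketched in a few lines. The function name `product_apsk` and its return format are assumptions for illustration; the phases are generated in $(0, 2\pi)$, which describes the same points as phases taken modulo $2\pi$ in $[-\pi, \pi)$.

```python
import cmath
import math

def product_apsk(m_phase, m_amp):
    """Return (points, radii) of the (2**m_phase x 2**m_amp)-ary product-APSK set of (44)-(45)."""
    n_rings, n_phases = 2 ** m_amp, 2 ** m_phase
    # Radii from (45): the Gaussian probability mass inside ring q is (q + 1/2) / n_rings.
    radii = [math.sqrt(-math.log(1.0 - (q + 0.5) / n_rings)) for q in range(n_rings)]
    # Phase shifts from (44): odd multiples of pi / n_phases.
    phases = [math.pi * (2 * p + 1) / n_phases for p in range(n_phases)]
    points = [r * cmath.exp(1j * phi) for r in radii for phi in phases]
    return points, radii

points, radii = product_apsk(4, 2)   # the (16 x 4 = 64)-APSK example
```

For a standard complex-valued Gaussian R.V. the probability of falling inside a circle of radius $r$ is $1 - \exp(-r^2)$, so the radii can be checked against the target probabilities $(q + 1/2) \cdot 2^{-m_\parallel}$.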
For the parameter pair $(m_\parallel, m_\angle)$, we have [10]

$$\begin{cases} m_\angle = m/2 + 1, \quad m_\parallel = m/2 - 1, & \text{for an even } m, \\ m_\angle = (m + 1)/2, \quad m_\parallel = (m - 1)/2, & \text{for an odd } m. \end{cases} \tag{46}$$

In [10], we determined this pair by maximizing the harmonic mean of the Euclidean distance. Nonetheless, we can also interpret this assignment according to the decomposition of mutual information in this correspondence. As shown in (21) and (32), we have $I(X_\angle;Y) \approx I(X_\parallel;Y) + 1.38$ (in bits/channel use) for Gaussian inputs at a high SNR. To make the product-APSK behave more like a Gaussian input, we should have $m_\angle > m_\parallel$ with a gap between them of around 1.38. However, since $m_\angle$ and $m_\parallel$ are both integers, the choice (46) is reasonable.
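The assignment (46) can be written as a small helper. The function name `split_bits` and the lower limit on `m` are assumptions made for this sketch.

```python
def split_bits(m):
    """Return (m_phase, m_amp) according to (46) for an m-bit product-APSK symbol."""
    if m < 2:
        raise ValueError("this sketch assumes at least 2 bits per symbol")
    if m % 2 == 0:
        return m // 2 + 1, m // 2 - 1
    return (m + 1) // 2, (m - 1) // 2

pairs = {m: split_bits(m) for m in range(2, 11)}
```

The resulting gap $m_\angle - m_\parallel$ is 2 for an even $m$ and 1 for an odd $m$, i.e., the two integers closest to the high-SNR gap of about 1.38 bits that are compatible with $m_\angle + m_\parallel = m$.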

It is interesting to note that the $(2^{m_\angle} \times 2^{m_\parallel})$-APSK constellation signal set $\mathcal{X}$ can be regarded as the product of the $2^{m_\angle}$-PSK set $\mathcal{P} = \{\exp(j\varphi_p)\}$ and the pseudo $2^{m_\parallel}$-PAM set $\mathcal{A} = \{r_q\}$, i.e., $\mathcal{X} = \mathcal{P} \times \mathcal{A}$. Additionally, we define the set of the phases as $\mathcal{P}_\angle = \{\varphi_p\}$.

Furthermore, based on the product-APSK set, we have the product labeling function $\mu: \mathbf{b} \mapsto x \in \mathcal{X}$, where $\mathbf{b}$ denotes an $m$-bit vector. The function $\mu$ consists of an amplitude-related labeling $\mu_\parallel: \mathbf{b}_\parallel \mapsto x_\parallel \in \mathcal{A}$ and a phase-related labeling $\mu_\angle: \mathbf{b}_\angle \mapsto x_\angle \in \mathcal{P}_\angle$, where $\mathbf{b}_\parallel$ denotes an $m_\parallel$-bit sub-vector of $\mathbf{b}$ and $\mathbf{b}_\angle$ denotes the remaining $m_\angle$-bit sub-vector, and we have $x = x_\parallel \exp(jx_\angle)$.

According to the above product-APSK constellation labeling, some bits are only relevant to the amplitude of the input signal, and the others are only relevant to the phase. Moreover, our recent experiments show that the demapper's complexity can be reduced from the order of $O(2^m)$ to $O(2^{m_\parallel} + 2^{m_\angle})$ with a negligible performance loss [4]. From an information-theoretical perspective, this is because the channel can be decomposed into an amplitude sub-channel and a phase sub-channel with a negligible information loss, as detailed below.

We first calculate the mutual information between the product-APSK input and its corresponding output over AWGN channels. The mutual information $I(X;Y)$, with the input $X$ taking on an $M$-ary constellation $\mathcal{X}$ with equal probability, can be evaluated as [22]

$$\begin{aligned} I(X;Y) &= \log_2 M - E_{x,y} \log_2 \frac{\sum_{\hat{x} \in \mathcal{X}} p_{Y \mid X}(y \mid \hat{x})}{p_{Y \mid X}(y \mid x)} \\ &= \log_2 M - \frac{1}{M} \sum_{x \in \mathcal{X}} E_w \log_2 \sum_{\hat{x} \in \mathcal{X}} \exp\left(-\frac{|x - \hat{x} + w|^2 - |w|^2}{N_0}\right), \end{aligned} \tag{47}$$

where $w$ denotes the realization of the complex-valued Gaussian noise $W$ with zero mean and variance $N_0$. The average symbol energy $E_s$ of the input signal $X$ constrained by the product-APSK constellation $\mathcal{X}$ is determined as

$$E_s = \frac{1}{2^{m_\parallel}} \sum_{q=0}^{2^{m_\parallel} - 1} r_q^2. \tag{48}$$

It is clear that

$$P_{X_\parallel,X_\angle}(x_\parallel,x_\angle) = P_{X_\parallel}(x_\parallel) \cdot P_{X_\angle}(x_\angle) = \frac{1}{M}\, \delta_{\mathcal{A},\mathcal{P}_\angle}(x_\parallel,x_\angle) \tag{49}$$

for product-APSK inputs, where $\delta_{\mathcal{A},\mathcal{P}_\angle}(x_\parallel,x_\angle) = 1$ if $x_\parallel \in \mathcal{A}$ and $x_\angle \in \mathcal{P}_\angle$, and 0 otherwise. Therefore, the amplitude and phase are independent of each other for product-APSK inputs, and the polar decomposition (7) also applies.
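The expectation over the noise in (47) has no closed form and is typically evaluated numerically. The following is a minimal Monte Carlo sketch under stated assumptions: the sample size, the seed, and the function names are illustrative choices, the transmitted symbol is drawn at random instead of being summed over exhaustively, and the constellation is rebuilt from (44) and (45) so that the block is self-contained.

```python
import cmath
import math
import random

def product_apsk(m_phase, m_amp):
    """Points of the (2**m_phase x 2**m_amp)-ary product-APSK set of (44)-(45)."""
    n_rings, n_phases = 2 ** m_amp, 2 ** m_phase
    radii = [math.sqrt(-math.log(1.0 - (q + 0.5) / n_rings)) for q in range(n_rings)]
    return [r * cmath.exp(1j * math.pi * (2 * p + 1) / n_phases)
            for r in radii for p in range(n_phases)]

def mutual_information(points, snr_db, n_samples=2000, seed=1):
    """Monte Carlo estimate of (47) in bits/channel use for equiprobable points."""
    rng = random.Random(seed)
    es = sum(abs(x) ** 2 for x in points) / len(points)   # average symbol energy, cf. (48)
    n0 = es / 10.0 ** (snr_db / 10.0)
    sigma = math.sqrt(n0 / 2.0)                           # noise std per real dimension
    acc = 0.0
    for _ in range(n_samples):
        x = rng.choice(points)
        w = complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        y = x + w
        s = sum(math.exp(-(abs(y - xh) ** 2 - abs(w) ** 2) / n0) for xh in points)
        acc += math.log2(s)
    return math.log2(len(points)) - acc / n_samples

points = product_apsk(4, 2)
mi_low = mutual_information(points, 0.0)
mi_high = mutual_information(points, 30.0)
```

Because the inner sum always contains the term for the transmitted point, which equals one, the estimate cannot exceed $\log_2 M$ apart from rounding; at high SNR it approaches $m = m_\angle + m_\parallel$ bits.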

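A. The Amplitude Term

For an $(M = 2^{m_\angle} \times 2^{m_\parallel})$-ary product-APSK input $X$, the amplitude term $I(X_\parallel;Y)$ can be evaluated as

$$\begin{aligned} I(X_\parallel;Y) &= E_{x_\parallel,y} \log_2 \frac{p_{Y \mid X_\parallel}(y \mid x_\parallel)}{p_Y(y)} \\ &= m_\parallel + E_{x,y} \log_2 \frac{\sum_{\hat{x} \in \mathcal{X}, \hat{x}_\parallel = x_\parallel} p_{Y \mid X}(y \mid \hat{x})}{\sum_{\hat{x} \in \mathcal{X}} p_{Y \mid X}(y \mid \hat{x})} \\ &= m_\parallel - \frac{1}{M} \sum_{x \in \mathcal{X}} E_w \log_2 \frac{\sum_{\hat{x} \in \mathcal{X}} \exp\left(-|x - \hat{x} + w|^2/N_0\right)}{\sum_{\hat{x} \in \mathcal{X}, \hat{x}_\parallel = x_\parallel} \exp\left(-|x - \hat{x} + w|^2/N_0\right)}. \end{aligned} \tag{50}$$

Furthermore, using the chain rule of mutual information, the amplitude term $I(X_\parallel;Y)$ can be written as

$$I(X_\parallel;Y) = I(X_\parallel;Y_\parallel) + I(X_\parallel;Y_\angle \mid Y_\parallel). \tag{51}$$

However, unlike the Gaussian input, for which $I(X_\parallel;Y_\angle \mid Y_\parallel) = 0$ as shown in (10), the term $I(X_\parallel;Y_\angle \mid Y_\parallel)$ is usually nonzero for product-APSK inputs; see Appendix B. Nevertheless, at a very high SNR, i.e., $N_0 \to 0$, we have $Y \to X$ so that $Y_\parallel \to X_\parallel$, and accordingly we have $I(X_\parallel;Y_\angle \mid Y_\parallel) \to 0$ and

$$I(X_\parallel;Y) \approx I(X_\parallel;Y_\parallel) \approx H(X_\parallel) = m_\parallel. \tag{52}$$

In addition, when there are plenty of points on each ring of the product-APSK constellation, we also have $I(X_\parallel;Y_\angle \mid Y_\parallel) \to 0$; see Appendix B for the proof. Therefore, we have the approximation $I(X_\parallel;Y_\parallel) \approx I(X_\parallel;Y)$, which suggests that when demapping the bits that are only relevant to the amplitude of the transmitted signal $X$, we can neglect the phase of the received signal $Y$ and use only $Y_\parallel$ instead [4].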
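B. The Phase Term

Similarly, the phase term $I(X_\angle;Y)$ can be calculated as

$$\begin{aligned} I(X_\angle;Y) &= E_{x_\angle,y} \log_2 \frac{p_{Y \mid X_\angle}(y \mid x_\angle)}{p_Y(y)} \\ &= m_\angle + E_{x,y} \log_2 \frac{\sum_{\hat{x} \in \mathcal{X}, \hat{x}_\angle = x_\angle} p_{Y \mid X}(y \mid \hat{x})}{\sum_{\hat{x} \in \mathcal{X}} p_{Y \mid X}(y \mid \hat{x})} \\ &= m_\angle - \frac{1}{M} \sum_{x \in \mathcal{X}} E_w \log_2 \frac{\sum_{\hat{x} \in \mathcal{X}} \exp\left(-|x - \hat{x} + w|^2/N_0\right)}{\sum_{\hat{x} \in \mathcal{X}, \hat{x}_\angle = x_\angle} \exp\left(-|x - \hat{x} + w|^2/N_0\right)}. \end{aligned} \tag{53}$$

By using the chain rule of mutual information again, $I(X_\angle;Y)$ can be written as

$$\begin{aligned} I(X_\angle;Y) &= I(X_\angle;Y_\parallel) + I(X_\angle;Y_\angle \mid Y_\parallel) \\ &\overset{(a)}{=} I(X_\angle;Y_\angle \mid Y_\parallel), \end{aligned} \tag{54}$$

wherein, similar to the case of Gaussian inputs, (a) follows from the fact that the output amplitude $Y_\parallel$ is independent of the input phase $X_\angle$. At a very high SNR, we have $Y_\angle \to X_\angle$ and the following approximation:

$$I(X_\angle;Y) = I(X_\angle;Y_\angle \mid Y_\parallel) \approx H(X_\angle) = m_\angle.$$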

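C. The Cross Term

The cross term $I(X_\parallel;X_\angle \mid Y)$ for a product-APSK input can be evaluated as

$$I(X_\parallel;X_\angle \mid Y) = E_y \sum_{x \in \mathcal{X}} P_{X \mid Y}(x \mid y) \log_2 \frac{P_{X \mid Y}(x \mid y)}{\sum_{\hat{x} \in \mathcal{X}, \hat{x}_\parallel = x_\parallel} P_{X \mid Y}(\hat{x} \mid y) \sum_{\hat{x} \in \mathcal{X}, \hat{x}_\angle = x_\angle} P_{X \mid Y}(\hat{x} \mid y)} \tag{55}$$

$$= \frac{1}{M} \sum_{x \in \mathcal{X}} \sum_{\hat{x} \in \mathcal{X}} E_w\, P_{X \mid Y}(\hat{x} \mid x + w) \log_2 \frac{P_{X \mid Y}(\hat{x} \mid x + w)}{\sum_{\tilde{x} \in \mathcal{X}, \tilde{x}_\parallel = \hat{x}_\parallel} P_{X \mid Y}(\tilde{x} \mid x + w) \sum_{\tilde{x} \in \mathcal{X}, \tilde{x}_\angle = \hat{x}_\angle} P_{X \mid Y}(\tilde{x} \mid x + w)}, \tag{56}$$

where $P_{X \mid Y}(\hat{x} \mid x + w)$ is expressed as

$$P_{X \mid Y}(\hat{x} \mid x + w) = \frac{\exp\left(-|x - \hat{x} + w|^2/N_0\right)}{\sum_{\tilde{x} \in \mathcal{X}} \exp\left(-|x - \tilde{x} + w|^2/N_0\right)}. \tag{57}$$

For a very high SNR, we have $N_0 \to 0$, $w \to 0$, and accordingly

$$P_{X \mid Y}(\hat{x} \mid x + w) \to \begin{cases} 1, & \hat{x} = x, \\ 0, & \text{otherwise}. \end{cases} \tag{58}$$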
Fig. 2: Polar decomposed terms of mutual information as a function of SNR for AWGN channels with Gaussian inputs. The lower bound of the amplitude term $I(X_\parallel;Y)$ shown in (21) and the upper bound of the phase term $I(X_\angle;Y)$ shown in (32) are also depicted.

Therefore, it is clear that $I(X_\parallel;X_\angle \mid Y) \approx 0$. In addition, since we have $I(X;Y) \to m_\angle + m_\parallel$, $I(X_\parallel;Y) \to m_\parallel$, $I(X_\angle;Y) \to m_\angle$, and $I(X;Y) = I(X_\parallel;Y) + I(X_\angle;Y) + I(X_\parallel;X_\angle \mid Y)$, we also have the limit that for a very high SNR

$$I(X_\parallel;X_\angle \mid Y) \to 0. \tag{59}$$
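The amplitude, phase, and cross terms of a product-APSK input can be estimated jointly from the same noise samples. The sketch below is one possible Monte Carlo estimator and not necessarily the procedure used for the reported results: it evaluates the log-ratios only at the transmitted symbol, which estimates the same expectations as (47), (50), (53), and (55) without the inner sum over candidate symbols in (56). The function name, sample size, and seed are assumptions.

```python
import cmath
import math
import random

def polar_terms_apsk(m_phase, m_amp, snr_db, n_samples=2000, seed=7):
    """Monte Carlo estimates of I(X;Y), the amplitude, phase, and cross terms, in bits."""
    n_rings, n_phases = 2 ** m_amp, 2 ** m_phase
    radii = [math.sqrt(-math.log(1.0 - (q + 0.5) / n_rings)) for q in range(n_rings)]
    # grid[q][p] is the point on ring q with phase index p, following (44)-(45).
    grid = [[r * cmath.exp(1j * math.pi * (2 * p + 1) / n_phases)
             for p in range(n_phases)] for r in radii]
    es = sum(r * r for r in radii) / n_rings              # average symbol energy (48)
    n0 = es / 10.0 ** (snr_db / 10.0)
    sigma = math.sqrt(n0 / 2.0)                           # noise std per real dimension
    rng = random.Random(seed)
    total = amp = phase = cross = 0.0
    for _ in range(n_samples):
        q, p = rng.randrange(n_rings), rng.randrange(n_phases)
        y = grid[q][p] + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
        # Unnormalised likelihoods exp(-|y - x'|^2 / N0) for every candidate point x'.
        lik = [[math.exp(-abs(y - x) ** 2 / n0) for x in row] for row in grid]
        a_all = sum(sum(row) for row in lik)
        a_ring = sum(lik[q])                  # candidates sharing the transmitted amplitude
        a_phase = sum(row[p] for row in lik)  # candidates sharing the transmitted phase
        a_x = lik[q][p]
        total += math.log2(n_rings * n_phases * a_x / a_all)
        amp += math.log2(n_rings * a_ring / a_all)
        phase += math.log2(n_phases * a_phase / a_all)
        cross += math.log2(a_x * a_all / (a_ring * a_phase))
    return tuple(v / n_samples for v in (total, amp, phase, cross))

terms_10db = polar_terms_apsk(4, 2, 10.0)
terms_30db = polar_terms_apsk(4, 2, 30.0)
```

With this estimator the decomposition (7) holds sample by sample, so the three estimated terms add up to the estimated $I(X;Y)$ up to rounding, and at high SNR the amplitude and phase estimates approach $m_\parallel$ and $m_\angle$ while the cross estimate approaches zero, in line with (52) and (59).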



V. Numeric Results 

A. Results of Gaussian Inputs 

We now present the results of the decomposed terms of mutual information. The results for 
AWGN channels with Gaussian inputs are shown in Fig. |2l wherein the notation AMI denotes 
the average mutual information. The lower bound of the amplitude term I{X\\; Y) in (|2T|) and 
the upper bound of the phase term I(X Z ; Y) in (|32l) are also depicted. It shows that these two 
bounds are both tight for a SNR higher than 12 dB. The cross term I(Xu;Xz\Y) reaches its 
maximum value of about 0.08 bits/channel use at SNR & 1 dB, and it tends to be zero at a 
high SNR. Moreover, the cross term is negligible compared to the amplitude or the phase term, 
which indicates that the AWGN channel can be decomposed into an amplitude sub-channel and 
a phase sub-channel with a negligible information loss for Gaussian inputs. For example, the 
Fig. 3: Polar decomposed terms of mutual information as a function of SNR for AWGN channels with the 64APSK inputs depicted in Fig. 1(a). The AMI associated with 64QAM inputs is also depicted for reference, to illustrate the shaping gain obtained by product-APSK.



gap between $I(X;Y)$ and $I(X_{\|};Y) + I(X_{\angle};Y)$, that is, the cross term $I(X_{\|};X_{\angle}|Y)$, is less than 0.04 bits/channel use at SNRs over 12 dB, and less than 0.02 bits/channel use at SNRs over 15 dB. In other words, for mutual information higher than 4 bits/channel use, the loss is less than 0.1 dB. For a clearer view of the cross term, please refer to Fig. 4.

B. Results of Product-APSK Inputs 

We take (16 × 4 = 64)-APSK as an example. The constellation is illustrated in Fig. 1(a), with the radii given by (43). The decomposition results are presented in Fig. 3. These results are quite similar to the Gaussian-input case, except that the amplitude term $I(X_{\|};Y)$ is upper-bounded by $m_{\|}$ and the phase term $I(X_{\angle};Y)$ is upper-bounded by $m_{\angle}$ at high SNRs. The cross term $I(X_{\|}; X_{\angle}|Y)$ is negligible. For instance, for code rates higher than 1/2, that is, for mutual information higher than 3 bits/channel use with 64APSK inputs, the loss is about 0.1 dB. In addition, the AMI $I(X;Y)$ associated with 64QAM inputs is also depicted for reference. Fig. 3 clearly shows that even when the decomposition is used, product-64APSK still outperforms 64QAM at code rates of usual interest, such as 1/2 or 2/3. For example, a shaping gain of about 0.6 dB can be obtained at a code rate of 2/3.

We collect the cross-term results for the three input cases together in Fig. 4,
Fig. 4: The cross term of the decomposition as a function of SNR for AWGN channels with Gaussian inputs and with the product-APSK inputs depicted in Fig. 1(a) and Fig. 1(b).



including Gaussian, 64APSK, and 256APSK. Interestingly, all of them reach their maximum value at SNR $\approx$ 1 dB. Moreover, the cross term increases with the constellation order, and intuitively, the curve associated with Gaussian inputs may be the limit for product-APSK inputs as the constellation order goes to infinity.

Although we only examined two kinds of inputs, namely, the Gaussian input and the product-APSK input, this decomposition is applicable to other inputs with independent amplitude and phase, such as PSK and the phase-modulated input [3, Sec. III(b)], which is named the ring constellation in ||2T|. Indeed, PSK can be regarded as a degenerate case of APSK that consists of a single ring, while the ring constellation is a limiting case of APSK that possesses infinitely many points on each ring. Therefore, our decomposition is also applicable to these inputs as a simple extension.

VI. Conclusions 

We have proposed a novel polar decomposition of mutual information for complex-valued channels with an input whose amplitude and phase are independent of each other. Using this decomposition, the mutual information between the channel's input and output is symmetrically decomposed into three terms: an amplitude term, a phase term, and a cross term. Because the cross term is negligible, the channel can be approximately decomposed into two sub-channels associated with amplitude and phase, respectively. This decomposition is then performed for AWGN channels with Gaussian and product-APSK inputs. For Gaussian inputs, theoretical bounds are derived. For product-APSK inputs, the decomposition facilitates the design of product-APSK and directly leads to a simplified demapper. This establishes a solid information-theoretic foundation for coded modulation schemes using product-APSK: better performance can be achieved by product-APSK than by QAM while the complexity is kept low.

Appendix 
A. Proof of $f(\lambda) < 1/2$

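We have the definition of $f(\lambda)$ as

$$\begin{aligned} f(\lambda) &= 1 + \lambda - \frac{\pi}{4}\exp(-\lambda)\left[(1+\lambda)I_0(\lambda/2) + \lambda I_1(\lambda/2)\right]^2 \\ &= 1 + \lambda - \frac{\pi}{4}\left[L_{1/2}(-\lambda)\right]^2, \end{aligned} \tag{60}$$

where $L_{1/2}(x) = \exp(x/2)\left[(1-x)I_0(-x/2) - xI_1(-x/2)\right]$ denotes the Laguerre polynomial of order 1/2. We have [20, 9.6.10, Page 375]

$$I_{\nu}(x) = \left(\frac{x}{2}\right)^{\nu}\sum_{k=0}^{\infty}\frac{(x/2)^{2k}}{k!\,\Gamma(\nu+k+1)}, \tag{61}$$

where $\Gamma(\cdot)$ denotes the Gamma function and $\Gamma(n+1) = n!$ for a positive integer $n$. Therefore, for a positive $x$, we have

$$I_0(x) = \sum_{k=0}^{\infty}\frac{(x/2)^{2k}}{(k!)^2} > 1 + \frac{x^2}{4} \tag{62}$$

and

$$I_1(x) = \sum_{k=0}^{\infty}\frac{(x/2)^{2k+1}}{k!\,(k+1)!} > \frac{x}{2}. \tag{63}$$

Thereby, we consequently have

$$L_{1/2}(-\lambda) > \exp(-\lambda/2)\left(1 + \lambda + \frac{5}{16}\lambda^2\right) \tag{64}$$

and

$$f(\lambda) < 1 + \lambda - \frac{\pi}{4}\exp(-\lambda)\left(1 + \lambda + \frac{5}{16}\lambda^2\right)^2. \tag{65}$$

It is easy to verify that the right side of (65) is a monotonically increasing function of $\lambda$. Therefore, for $\lambda \in [0, 1]$, evaluating the right side of (65) at $\lambda = 1$ gives

$$f(\lambda) < 2 - \frac{\pi}{4e}\left(\frac{37}{16}\right)^2 \approx 0.455 < 1/2, \quad \forall \lambda \in [0, 1]. \tag{66}$$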
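The bound for $\lambda \in [0, 1]$ can be checked numerically. The following Python sketch is only a sanity check under our own assumptions: the helper names are hypothetical, the Bessel functions are summed from the power series (61) rather than taken from a library, and the grid spacing is arbitrary.

```python
import math


def bessel_i(order, x):
    """I_order(x) for integer order >= 0, summed from the power series (61)."""
    term = (x / 2) ** order / math.factorial(order)
    total, k = 0.0, 0
    while k == 0 or term > 1e-17 * total:
        total += term
        k += 1
        term *= (x / 2) ** 2 / (k * (k + order))
    return total


def laguerre_half(lam):
    """L_{1/2}(-lam), from the closed form given below (60)."""
    return math.exp(-lam / 2) * ((1 + lam) * bessel_i(0, lam / 2)
                                 + lam * bessel_i(1, lam / 2))


def f(lam):
    """f(lam) as defined in (60)."""
    return 1 + lam - math.pi / 4 * laguerre_half(lam) ** 2


def f_upper(lam):
    """Right side of (65)."""
    poly = 1 + lam + 5 * lam ** 2 / 16
    return 1 + lam - math.pi / 4 * math.exp(-lam) * poly ** 2


grid = [i / 200 for i in range(201)]  # lam in [0, 1]
bound_holds = all(f(lam) <= f_upper(lam) + 1e-12 for lam in grid)
increasing = all(f_upper(a) < f_upper(b) for a, b in zip(grid, grid[1:]))
bound_at_one = f_upper(1.0)
print(bound_holds, increasing, round(bound_at_one, 3))
```

On this grid, the right side of (65) should dominate $f(\lambda)$, increase with $\lambda$, and evaluate to about 0.455 at $\lambda = 1$, matching (66).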
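For $\lambda > 1$, we have the series associated with the modified Bessel function of the first kind with order $\nu$ as [20, 9.7.1, Page 377]

$$I_{\nu}(x) \sim \frac{e^{x}}{\sqrt{2\pi x}}\left\{1 - \frac{\mu-1}{8x} + \frac{(\mu-1)(\mu-9)}{2!\,(8x)^2} - \frac{(\mu-1)(\mu-9)(\mu-25)}{3!\,(8x)^3} + \cdots\right\} \tag{67}$$

for a large $x$, where $\mu = 4\nu^2$. Therefore, we have

$$\begin{aligned} I_0(\lambda/2) &= \frac{e^{\lambda/2}}{\sqrt{\pi\lambda}}\left[1 + \sum_{n=1}^{\infty}\frac{\prod_{k=1}^{n}(2k-1)^2}{n!\,4^n}\left(\frac{1}{\lambda}\right)^n\right] \\ &= \frac{e^{\lambda/2}}{\sqrt{\pi\lambda}}\left[1 + \sum_{n=1}^{\infty}\frac{\Gamma^2(n+1/2)}{\pi\,n!}\left(\frac{1}{\lambda}\right)^n\right] \end{aligned} \tag{68}$$

and

$$\begin{aligned} I_1(\lambda/2) &= \frac{e^{\lambda/2}}{\sqrt{\pi\lambda}}\left[1 + \sum_{n=1}^{\infty}\frac{(-1)^n\prod_{k=1}^{n}\left[4-(2k-1)^2\right]}{n!\,4^n}\left(\frac{1}{\lambda}\right)^n\right] \\ &= \frac{e^{\lambda/2}}{\sqrt{\pi\lambda}}\left[1 - \sum_{n=1}^{\infty}\frac{\Gamma(n+3/2)\,\Gamma(n-1/2)}{\pi\,n!}\left(\frac{1}{\lambda}\right)^n\right]. \end{aligned} \tag{69}$$

Thereby, we have

$$\begin{aligned} L_{1/2}(-\lambda) &= \frac{1}{\sqrt{\pi\lambda}}\left\{2\lambda + \frac{1}{2} + \sum_{n=1}^{\infty}\frac{(1+n)\Gamma^2(n+\frac{1}{2}) + \Gamma^2(n+\frac{3}{2}) - \Gamma(n+\frac{1}{2})\Gamma(n+\frac{5}{2})}{\pi\,(n+1)!}\left(\frac{1}{\lambda}\right)^n\right\} \\ &= \frac{1}{\sqrt{\pi\lambda}}\left\{2\lambda + \frac{1}{2} + \sum_{n=1}^{\infty}\frac{\Gamma^2(n+\frac{1}{2})}{2\pi\,(n+1)!}\left(\frac{1}{\lambda}\right)^n\right\} \\ &> \frac{1}{\sqrt{\pi}}\left(2\sqrt{\lambda} + \frac{1}{2\sqrt{\lambda}}\right). \end{aligned} \tag{70}$$

In the above proof, we have used the Gamma function identity

$$\Gamma(n+1/2) = \frac{\prod_{k=1}^{n}(2k-1)}{2^n}\,\Gamma(1/2) \tag{71}$$

and $\Gamma(1/2) = \sqrt{\pi}$. Now we may write that for $\lambda > 1$, we have

$$f(\lambda) < 1 + \lambda - \frac{1}{4}\left(2\sqrt{\lambda} + \frac{1}{2\sqrt{\lambda}}\right)^2 = \frac{1}{2} - \frac{1}{16\lambda} < 1/2, \quad \forall \lambda \in (1, \infty). \tag{72}$$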



Based on (66) and (72), we have $f(\lambda) < 1/2$, $\forall \lambda \geq 0$. To give an intuitive view of the above proof, we provide the numeric results of $L_{1/2}(-\lambda)$ and its approximations (64) and (70) in Fig. 5. In addition, $f(\lambda)$ as a function of $\lambda$ is plotted in Fig. 6.
Fig. 6: Numeric results of $f(\lambda)$ as a function of $\lambda$.
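The behaviour for $\lambda > 1$ can be spot-checked in the same spirit. The Python sketch below is a numerical sanity check, not a proof: the helper names and the grid (up to $\lambda \approx 100$) are our own assumptions, and the Bessel functions are again summed from their power series.

```python
import math


def bessel_i(order, x):
    """I_order(x) for integer order >= 0, summed from its power series."""
    term = (x / 2) ** order / math.factorial(order)
    total, k = 0.0, 0
    while k == 0 or term > 1e-17 * total:
        total += term
        k += 1
        term *= (x / 2) ** 2 / (k * (k + order))
    return total


def laguerre_half(lam):
    """L_{1/2}(-lam) = exp(-lam/2) [(1 + lam) I0(lam/2) + lam I1(lam/2)]."""
    return math.exp(-lam / 2) * ((1 + lam) * bessel_i(0, lam / 2)
                                 + lam * bessel_i(1, lam / 2))


def f(lam):
    """f(lam) = 1 + lam - (pi/4) [L_{1/2}(-lam)]^2."""
    return 1 + lam - math.pi / 4 * laguerre_half(lam) ** 2


def laguerre_lower(lam):
    """Lower bound on L_{1/2}(-lam) used for lam > 1."""
    return (2 * math.sqrt(lam) + 1 / (2 * math.sqrt(lam))) / math.sqrt(math.pi)


large = [1 + i / 10 for i in range(1, 1000)]  # lam in (1, 100.9]
holds_lower = all(laguerre_half(lam) > laguerre_lower(lam) for lam in large)
holds_upper = all(f(lam) < 0.5 - 1 / (16 * lam) for lam in large)
below_half = all(f(i / 10) < 0.5 for i in range(0, 1010))
print(holds_lower, holds_upper, below_half)
```

All three flags are expected to be true on this grid. Because the grid is finite, this only supports the analytical argument and does not replace it.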
B. Proof of $I(X_{\|}; Y_{\angle}|Y_{\|}) \neq 0$ for product-APSK inputs

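To show $I(X_{\|}; Y_{\angle}|Y_{\|}) \neq 0$, it is equivalent to show that $p_{Y_{\angle}|X_{\|},Y_{\|}}(y_{\angle}|x_{\|},y_{\|}) \neq p_{Y_{\angle}|Y_{\|}}(y_{\angle}|y_{\|})$, or equivalently, that $p_{Y_{\angle}|X_{\|},Y_{\|}}(y_{\angle}|x_{\|},y_{\|})$ depends on $x_{\|}$. In fact, we have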

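$$\begin{aligned} p_{X_{\|},X_{\angle},Y_{\|},Y_{\angle}}(x_{\|},x_{\angle},y_{\|},y_{\angle}) &= P_{X_{\|},X_{\angle}}(x_{\|},x_{\angle})\,p_{Y_{\|},Y_{\angle}|X_{\|},X_{\angle}}(y_{\|},y_{\angle}|x_{\|},x_{\angle}) \\ &= P_{X_{\|},X_{\angle}}(x_{\|},x_{\angle})\,\frac{y_{\|}}{\pi N_0}\exp\left[-\frac{x_{\|}^2 + y_{\|}^2 - 2x_{\|}y_{\|}\cos(y_{\angle}-x_{\angle})}{N_0}\right]. \end{aligned} \tag{73}$$

The summation of $p_{X_{\|},X_{\angle},Y_{\|},Y_{\angle}}(x_{\|},x_{\angle},y_{\|},y_{\angle})$ with respect to $x_{\angle}$ yields

$$p_{X_{\|},Y_{\|},Y_{\angle}}(x_{\|},y_{\|},y_{\angle}) = \frac{1}{2^{m_{\|}}}\sum_{x_{\angle}\in\mathcal{P}_{\angle}}\frac{1}{2^{m_{\angle}}}\,\frac{y_{\|}}{\pi N_0}\exp\left[-\frac{x_{\|}^2 + y_{\|}^2 - 2x_{\|}y_{\|}\cos(y_{\angle}-x_{\angle})}{N_0}\right]\delta_{\mathcal{A}}(x_{\|}), \tag{74}$$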



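where $\delta_{\mathcal{A}}(x_{\|}) = 1$ if $x_{\|} \in \mathcal{A}$, and 0 otherwise. $\mathcal{A}$ denotes the set of the amplitudes of our product-APSK defined in Section IV. Furthermore, integrating $p_{X_{\|},Y_{\|},Y_{\angle}}(x_{\|},y_{\|},y_{\angle})$ with respect to $y_{\angle}$ yields

$$p_{X_{\|},Y_{\|}}(x_{\|},y_{\|}) = \frac{2y_{\|}}{2^{m_{\|}}N_0}\exp\left(-\frac{x_{\|}^2 + y_{\|}^2}{N_0}\right)I_0\left(\frac{2x_{\|}y_{\|}}{N_0}\right)\delta_{\mathcal{A}}(x_{\|}). \tag{75}$$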



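Now we have

$$p_{Y_{\angle}|X_{\|},Y_{\|}}(y_{\angle}|x_{\|},y_{\|}) = \frac{p_{Y_{\angle},X_{\|},Y_{\|}}(y_{\angle},x_{\|},y_{\|})}{p_{X_{\|},Y_{\|}}(x_{\|},y_{\|})} = \frac{1}{2^{m_{\angle}}}\sum_{x_{\angle}\in\mathcal{P}_{\angle}}\frac{\exp\left[2x_{\|}y_{\|}\cos(y_{\angle}-x_{\angle})/N_0\right]}{2\pi I_0(2x_{\|}y_{\|}/N_0)}\,\delta_{\mathcal{A}}(x_{\|}). \tag{76}$$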



It is clear from (76) that $p_{Y_{\angle}|X_{\|},Y_{\|}}(y_{\angle}|x_{\|},y_{\|})$ depends on $x_{\|}$, and thereby we usually have $I(X_{\|}; Y_{\angle}|Y_{\|}) \neq 0$ for a product-APSK input.
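The dependence on $x_{\|}$ in (76) is easy to see numerically. The following Python sketch evaluates the right side of (76) for two ring radii at the same output magnitude. It assumes that the phase set consists of equally spaced phases $2\pi k/2^{m_{\angle}}$ with zero offset, which may differ from the set defined in Section IV; the function names and the numerical values are likewise illustrative assumptions.

```python
import math


def bessel_i0(z):
    """I_0(z), summed from its power series."""
    term, total, k = 1.0, 0.0, 0
    while k == 0 or term > 1e-17 * total:
        total += term
        k += 1
        term *= (z / 2) ** 2 / (k * k)
    return total


def phase_pdf(y_phase, x_amp, y_amp, n0, n_phases):
    """Right side of (76) for an amplitude x_amp that belongs to the amplitude set."""
    z = 2 * x_amp * y_amp / n0
    kernel = sum(math.exp(z * math.cos(y_phase - 2 * math.pi * k / n_phases))
                 for k in range(n_phases))
    return kernel / (n_phases * 2 * math.pi * bessel_i0(z))


def integral_over_phase(x_amp, y_amp, n0, n_phases, steps=2000):
    """Rectangle-rule integral of phase_pdf over one period of y_phase."""
    h = 2 * math.pi / steps
    return h * sum(phase_pdf(-math.pi + i * h, x_amp, y_amp, n0, n_phases)
                   for i in range(steps))


# Same output magnitude and phase, two different candidate ring radii.
inner = phase_pdf(0.0, 0.5, 1.0, 0.5, 4)
outer = phase_pdf(0.0, 1.0, 1.0, 0.5, 4)
total = integral_over_phase(0.5, 1.0, 0.5, 4)
print(inner, outer, total)
```

The two density values differ, while the integral over $y_{\angle}$ stays at one, so the conditional phase density is a valid PDF that changes with the ring radius.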

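However, at a very high SNR with $N_0 \to 0$, we have $y_{\angle} \to x_{\angle}$ and $y_{\|} \to x_{\|}$, and by using the approximation $I_0(z) \approx \exp(z)/\sqrt{2\pi z}$, (76) can be approximated as

$$p_{Y_{\angle}|X_{\|},Y_{\|}}(y_{\angle}|x_{\|},y_{\|}) \approx \frac{1}{2^{m_{\angle}}}\sum_{x_{\angle}\in\mathcal{P}_{\angle}}\frac{1}{\sqrt{\pi N_0/x_{\|}^2}}\exp\left[-\frac{(y_{\angle}-x_{\angle})^2}{N_0/x_{\|}^2}\right]\delta_{\mathcal{A}}(x_{\|}). \tag{77}$$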



That is, the PDF of $Y_{\angle}$ given $X_{\|} = x_{\|} \in \mathcal{A}$ and $Y_{\|}$ is a uniformly weighted mixture of Gaussian distributions with means $x_{\angle} \in \mathcal{P}_{\angle}$ (defined in Section IV) and variance $N_0/(2x_{\|}^2)$. Furthermore, since $y_{\|} \to x_{\|}$ at a very high SNR, we also have $p_{Y_{\angle}|X_{\|},Y_{\|}}(y_{\angle}|x_{\|},y_{\|}) \approx p_{Y_{\angle}|Y_{\|}}(y_{\angle}|y_{\|}) \approx p_{Y_{\angle}|X_{\|}}(y_{\angle}|x_{\|})$. Therefore, we have $I(X_{\|}; Y_{\angle}|Y_{\|}) \to 0$.
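The quality of the Gaussian approximation in (77) can be gauged with a short Python sketch that compares a single term of (76), evaluated at $y_{\|} = x_{\|}$, with the corresponding Gaussian term of (77). The function names, the unit ring radius, and the chosen noise levels are assumptions for illustration only.

```python
import math


def bessel_i0(z):
    """I_0(z), summed from its power series."""
    term, total, k = 1.0, 0.0, 0
    while k == 0 or term > 1e-17 * total:
        total += term
        k += 1
        term *= (z / 2) ** 2 / (k * k)
    return total


def exact_kernel(theta, x_amp, n0):
    """One term of (76) with y_amp = x_amp; theta is the phase difference."""
    z = 2 * x_amp ** 2 / n0
    return math.exp(z * math.cos(theta)) / (2 * math.pi * bessel_i0(z))


def gaussian_kernel(theta, x_amp, n0):
    """One term of (77): Gaussian in theta with variance n0 / (2 x_amp^2)."""
    s = n0 / x_amp ** 2
    return math.exp(-theta ** 2 / s) / math.sqrt(math.pi * s)


def worst_relative_error(x_amp, n0):
    """Largest relative mismatch at 0, 1 and 2 standard deviations."""
    sigma = math.sqrt(n0 / 2) / x_amp
    return max(abs(gaussian_kernel(t, x_amp, n0) / exact_kernel(t, x_amp, n0) - 1)
               for t in (0.0, sigma, 2 * sigma))


errors = {n0: worst_relative_error(1.0, n0) for n0 in (0.1, 0.03, 0.01)}
for n0, err in errors.items():
    print(f"N0 = {n0}: worst relative error = {err:.4f}")
```

The relative error should shrink as $N_0$ decreases, which is consistent with treating (77) as a high-SNR approximation only.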

In addition, when there are many points on each ring such that $m_{\angle} \to \infty$, (76) could be simplified as

$$\begin{aligned} p_{Y_{\angle}|X_{\|},Y_{\|}}(y_{\angle}|x_{\|},y_{\|}) &= \frac{\delta_{\mathcal{A}}(x_{\|})}{2\pi I_0(2x_{\|}y_{\|}/N_0)}\cdot\frac{1}{2^{m_{\angle}}}\sum_{x_{\angle}\in\mathcal{P}_{\angle}}\exp\left[2x_{\|}y_{\|}\cos(x_{\angle}-y_{\angle})/N_0\right] \\ &= \frac{\delta_{\mathcal{A}}(x_{\|})}{2\pi I_0(2x_{\|}y_{\|}/N_0)}\cdot\frac{1}{2\pi}\int_{-\pi}^{\pi}\exp\left[2x_{\|}y_{\|}\cos(x_{\angle}-y_{\angle})/N_0\right]dx_{\angle} \\ &= \frac{1}{2\pi}\,\delta_{\mathcal{A}}(x_{\|}). \end{aligned} \tag{78}$$



Thereby, the phase of the output signal is uniformly distributed within $[-\pi, \pi)$, while being independent of its magnitude, for product-APSK inputs when the number of points on each ring tends to infinity (typically, when the constellation order tends to infinity). Intuitively, (78) is reasonable, because when the number of points on each ring tends to infinity, the phase of the input signal also tends to be uniformly distributed within $[-\pi, \pi)$. In this case, we would have $I(X_{\|}; Y_{\angle}|Y_{\|}) \to 0$, which is quite similar to the Gaussian-input case.
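The limit in (78) can also be observed numerically: as the number of phases per ring grows, the conditional phase density of (76) flattens towards $1/(2\pi)$. The Python sketch below again assumes equally spaced phases with zero offset; the function names and the parameter values ($x_{\|} = y_{\|} = 1$, $N_0 = 0.25$) are hypothetical choices for illustration.

```python
import math


def bessel_i0(z):
    """I_0(z), summed from its power series."""
    term, total, k = 1.0, 0.0, 0
    while k == 0 or term > 1e-17 * total:
        total += term
        k += 1
        term *= (z / 2) ** 2 / (k * k)
    return total


def phase_pdf(y_phase, x_amp, y_amp, n0, n_phases):
    """Right side of (76) for an amplitude x_amp that belongs to the amplitude set."""
    z = 2 * x_amp * y_amp / n0
    kernel = sum(math.exp(z * math.cos(y_phase - 2 * math.pi * k / n_phases))
                 for k in range(n_phases))
    return kernel / (n_phases * 2 * math.pi * bessel_i0(z))


def max_deviation_from_uniform(x_amp, y_amp, n0, n_phases, steps=720):
    """Largest gap between the phase density and 1/(2 pi) over a grid of y_phase."""
    uniform = 1 / (2 * math.pi)
    h = 2 * math.pi / steps
    return max(abs(phase_pdf(-math.pi + i * h, x_amp, y_amp, n0, n_phases) - uniform)
               for i in range(steps))


deviation = {n: max_deviation_from_uniform(1.0, 1.0, 0.25, n) for n in (4, 16, 64)}
for n, dev in deviation.items():
    print(f"{n:3d} phases per ring: max deviation = {dev:.3e}")
```

With these values the deviation is substantial for 4 phases and becomes negligible for 64 phases, in agreement with the uniform limit in (78).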